{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 08 水果拼盘-多表拼接"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "from datetime import datetime\n",
    "# 一个cell输出多行语句\n",
    "from IPython.core.interactiveshell import InteractiveShell\n",
    "InteractiveShell.ast_node_interactivity = \"all\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 一、表的横向拼接\n",
    "表的横向拼接就是**横向**将两个表依据**公共列**拼接在一起\n",
    "\n",
    "Excel实现：**vlookup**函数\n",
    "\n",
    "Python实现：**merge()**方法\n",
    "\n",
    "### 1.1 连接表的类型\n",
    "连接表的类型关注：两个表是什么类型，分为三种情况：一对一、多对一、多对多\n",
    "#### 1.1.1 一对一"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次  姓名   学号   成绩\n",
       "0   1  小张  100  650\n",
       "1   2  小王  101  600\n",
       "2   3  小李  102  578\n",
       "3   4  小赵  103  550"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学号</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>100</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>101</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>102</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>103</td>\n",
       "      <td>四班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    学号  班级\n",
       "0  100  一班\n",
       "1  101  二班\n",
       "2  102  三班\n",
       "3  103  四班"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550</td>\n",
       "      <td>四班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次  姓名   学号   成绩  班级\n",
       "0   1  小张  100  650  一班\n",
       "1   2  小王  101  600  二班\n",
       "2   3  小李  102  578  三班\n",
       "3   4  小赵  103  550  四班"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([[1, '小张', 100, 650], [2, '小王', 101, 600], [3, '小李', 102, 578],\n",
    "                    [4, '小赵', 103, 550]], columns=['名次', '姓名', '学号', '成绩'])\n",
    "df1\n",
    "df2 = pd.DataFrame([[100, '一班'], [101, '二班'], [102, '三班'], [103, '四班']],\n",
    "                   columns=['学号', '班级'])\n",
    "df2\n",
    "pd.merge(df1, df2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.1.2 多对一"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>f_成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   姓名   学号  f_成绩\n",
       "0  小张  100   650\n",
       "1  小王  101   600\n",
       "2  小李  102   578"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学号</th>\n",
       "      <th>e_成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>100</td>\n",
       "      <td>586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>100</td>\n",
       "      <td>602</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>101</td>\n",
       "      <td>691</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>101</td>\n",
       "      <td>702</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>102</td>\n",
       "      <td>645</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>102</td>\n",
       "      <td>676</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    学号  e_成绩\n",
       "0  100   586\n",
       "1  100   602\n",
       "2  101   691\n",
       "3  101   702\n",
       "4  102   645\n",
       "5  102   676"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>f_成绩</th>\n",
       "      <th>e_成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>586</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>602</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>691</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>702</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>645</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>676</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   姓名   学号  f_成绩  e_成绩\n",
       "0  小张  100   650   586\n",
       "1  小张  100   650   602\n",
       "2  小王  101   600   691\n",
       "3  小王  101   600   702\n",
       "4  小李  102   578   645\n",
       "5  小李  102   578   676"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([['小张', 100, 650], ['小王', 101, 600], ['小李', 102, 578]],\n",
    "                   columns=['姓名', '学号', 'f_成绩'])\n",
    "df1\n",
    "df2 = pd.DataFrame([[100, 586], [100, 602], [101, 691], [101, 702], [102, 645], [102, 676]],\n",
    "                   columns=['学号', 'e_成绩'])\n",
    "df2\n",
    "pd.merge(df1, df2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.1.3 多对多\n",
    "多对多相当于多个多对一"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>f_成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>610</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>542</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   姓名   学号  f_成绩\n",
       "0  小张  100   650\n",
       "1  小张  100   610\n",
       "2  小王  101   600\n",
       "3  小李  102   578\n",
       "4  小李  102   542"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学号</th>\n",
       "      <th>e_成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>100</td>\n",
       "      <td>610</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>102</td>\n",
       "      <td>542</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    学号  e_成绩\n",
       "0  100   650\n",
       "1  100   610\n",
       "2  101   600\n",
       "3  102   578\n",
       "4  102   542"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>f_成绩</th>\n",
       "      <th>e_成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>610</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>610</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>610</td>\n",
       "      <td>610</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>542</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>542</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>8</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>542</td>\n",
       "      <td>542</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   姓名   学号  f_成绩  e_成绩\n",
       "0  小张  100   650   650\n",
       "1  小张  100   650   610\n",
       "2  小张  100   610   650\n",
       "3  小张  100   610   610\n",
       "4  小王  101   600   600\n",
       "5  小李  102   578   578\n",
       "6  小李  102   578   542\n",
       "7  小李  102   542   578\n",
       "8  小李  102   542   542"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([['小张', 100, 650], ['小张', 100, 610], ['小王', 101, 600], \n",
    "                    ['小李', 102, 578], ['小李', 102, 542]],\n",
    "                   columns=['姓名', '学号', 'f_成绩'])\n",
    "df1\n",
    "df2 = pd.DataFrame([[100, 650], [100, 610], [101, 600], [102, 578], [102, 542]],\n",
    "                   columns=['学号', 'e_成绩'])\n",
    "df2\n",
    "pd.merge(df1, df2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.2 连接键的类型\n",
    "默认以公共列作为连接键，**on**指定连接键，一般指定的也是两个表的公共列\n",
    "\n",
    "#### 1.2.1 on来指定连接键"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>f_成绩</th>\n",
       "      <th>e_成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>610</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>610</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>610</td>\n",
       "      <td>610</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>542</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>542</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>8</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>542</td>\n",
       "      <td>542</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   姓名   学号  f_成绩  e_成绩\n",
       "0  小张  100   650   650\n",
       "1  小张  100   650   610\n",
       "2  小张  100   610   650\n",
       "3  小张  100   610   610\n",
       "4  小王  101   600   600\n",
       "5  小李  102   578   578\n",
       "6  小李  102   578   542\n",
       "7  小李  102   542   578\n",
       "8  小李  102   542   542"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.merge(df1, df2, on='学号')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次  姓名   学号   成绩\n",
       "0   1  小张  100  650\n",
       "1   2  小王  101  600\n",
       "2   3  小李  102  578\n",
       "3   4  小赵  103  550"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   姓名   学号  班级\n",
       "0  小张  100  一班\n",
       "1  小王  101  一班\n",
       "2  小李  102  二班\n",
       "3  小赵  103  三班"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次  姓名   学号   成绩  班级\n",
       "0   1  小张  100  650  一班\n",
       "1   2  小王  101  600  一班\n",
       "2   3  小李  102  578  二班\n",
       "3   4  小赵  103  550  三班"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([[1, '小张', 100, 650], [2, '小王', 101, 600], [3, '小李', 102, 578],\n",
    "                    [4, '小赵', 103, 550]], columns=['名次', '姓名', '学号', '成绩'])\n",
    "df1\n",
    "df2 = pd.DataFrame([['小张', 100, '一班'], ['小王', 101, '一班'], ['小李', 102, '二班'], ['小赵', 103, '三班']],\n",
    "                   columns=['姓名', '学号', '班级'])\n",
    "df2\n",
    "pd.merge(df1, df2, on=['姓名', '学号'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.2.2 分别指定左右连接键\n",
    "**left_on**、**right_on**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>编号</th>\n",
       "      <th>成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次  姓名   编号   成绩\n",
       "0   1  小张  100  650\n",
       "1   2  小王  101  600\n",
       "2   3  小李  102  578\n",
       "3   4  小赵  103  550"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学号</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>100</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>101</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>102</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>103</td>\n",
       "      <td>四班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    学号  班级\n",
       "0  100  一班\n",
       "1  101  二班\n",
       "2  102  三班\n",
       "3  103  四班"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>编号</th>\n",
       "      <th>成绩</th>\n",
       "      <th>学号</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>100</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>101</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>102</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550</td>\n",
       "      <td>103</td>\n",
       "      <td>四班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次  姓名   编号   成绩   学号  班级\n",
       "0   1  小张  100  650  100  一班\n",
       "1   2  小王  101  600  101  二班\n",
       "2   3  小李  102  578  102  三班\n",
       "3   4  小赵  103  550  103  四班"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([[1, '小张', 100, 650], [2, '小王', 101, 600], [3, '小李', 102, 578],\n",
    "                    [4, '小赵', 103, 550]], columns=['名次', '姓名', '编号', '成绩'])\n",
    "df1\n",
    "df2 = pd.DataFrame([[100, '一班'], [101, '二班'], [102, '三班'], [103, '四班']],\n",
    "                   columns=['学号', '班级'])\n",
    "df2\n",
    "pd.merge(df1, df2, left_on='编号', right_on='学号')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.2.3 把索引列当做连接键\n",
    "**left_index**、**right_index**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>成绩</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>编号</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>100</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>101</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>102</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>103</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>550</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     名次  姓名   成绩\n",
       "编号              \n",
       "100   1  小张  650\n",
       "101   2  小王  600\n",
       "102   3  小李  578\n",
       "103   4  小赵  550"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>学号</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>100</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>101</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>102</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>103</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     班级\n",
       "学号     \n",
       "100  一班\n",
       "101  一班\n",
       "102  二班\n",
       "103  三班"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>成绩</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>编号</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>100</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>650</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>101</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>600</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>102</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>578</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>103</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>550</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     名次  姓名   成绩  班级\n",
       "编号                  \n",
       "100   1  小张  650  一班\n",
       "101   2  小王  600  一班\n",
       "102   3  小李  578  二班\n",
       "103   4  小赵  550  三班"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([[1, '小张', 650], [2, '小王', 600], [3, '小李', 578], [4, '小赵', 550]],\n",
    "                  index=[100, 101, 102, 103], columns=['名次', '姓名', '成绩'])\n",
    "df1.index.name = '编号'\n",
    "df1\n",
    "\n",
    "df2 = pd.DataFrame([['一班'], ['一班'], ['二班'], ['三班']], index=[100, 101, 102, 103], columns=['班级'])\n",
    "df2.index.name = '学号'\n",
    "df2\n",
    "\n",
    "pd.merge(df1, df2, left_index=True, right_index=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>学号</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>100</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>101</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>102</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>103</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    学号  班级\n",
       "0  100  一班\n",
       "1  101  一班\n",
       "2  102  二班\n",
       "3  103  三班"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>成绩</th>\n",
       "      <th>学号</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>650</td>\n",
       "      <td>100</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>600</td>\n",
       "      <td>101</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>578</td>\n",
       "      <td>102</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>550</td>\n",
       "      <td>103</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次  姓名   成绩   学号  班级\n",
       "0   1  小张  650  100  一班\n",
       "1   2  小王  600  101  一班\n",
       "2   3  小李  578  102  二班\n",
       "3   4  小赵  550  103  三班"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = pd.DataFrame([[100, '一班'], [101, '一班'], [102, '二班'], [103, '三班']], columns=['学号', '班级'])\n",
    "df2\n",
    "\n",
    "pd.merge(df1, df2, left_index=True, right_on='学号')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.3 连接方式\n",
    "参数how指明具体的连接方式\n",
    "#### 1.3.1 内连接（inner）\n",
    "内连接取两个表中的公共部分，默认内连接"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次  姓名   学号   成绩\n",
       "0   1  小张  100  650\n",
       "1   2  小王  101  600\n",
       "2   3  小李  102  578\n",
       "3   4  小赵  103  550"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>学号</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>小钱</td>\n",
       "      <td>104</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   姓名   学号  班级\n",
       "0  小张  100  一班\n",
       "1  小王  101  一班\n",
       "2  小李  102  二班\n",
       "3  小钱  104  三班"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名_x</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "      <th>姓名_y</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>小张</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>小王</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>小李</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次 姓名_x   学号   成绩 姓名_y  班级\n",
       "0   1   小张  100  650   小张  一班\n",
       "1   2   小王  101  600   小王  一班\n",
       "2   3   小李  102  578   小李  二班"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([[1, '小张', 100, 650], [2, '小王', 101, 600], [3, '小李', 102, 578],\n",
    "                    [4, '小赵', 103, 550]], columns=['名次', '姓名', '学号', '成绩'])\n",
    "df1\n",
    "df2 = pd.DataFrame([['小张', 100, '一班'], ['小王', 101, '一班'], ['小李', 102, '二班'], ['小钱', 104, '三班']], \n",
    "                   columns=['姓名', '学号', '班级'])\n",
    "df2\n",
    "pd.merge(df1, df2, on='学号', how='inner')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.3.2 左连接（left）\n",
    "以左表为基础，右表往左表上拼接，右表中没有学号为103的信息，所以为NaN"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名_x</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "      <th>姓名_y</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>小张</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>小王</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>小李</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次 姓名_x   学号   成绩 姓名_y   班级\n",
       "0   1   小张  100  650   小张   一班\n",
       "1   2   小王  101  600   小王   一班\n",
       "2   3   小李  102  578   小李   二班\n",
       "3   4   小赵  103  550  NaN  NaN"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.merge(df1, df2, on='学号', how='left')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.3.3 右连接（right）\n",
    "以右表为基础，左表往右表上拼接，左表中没有学号为104的信息，所以是NaN"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名_x</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "      <th>姓名_y</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650.0</td>\n",
       "      <td>小张</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2.0</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600.0</td>\n",
       "      <td>小王</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3.0</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578.0</td>\n",
       "      <td>小李</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>104</td>\n",
       "      <td>NaN</td>\n",
       "      <td>小钱</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    名次 姓名_x   学号     成绩 姓名_y  班级\n",
       "0  1.0   小张  100  650.0   小张  一班\n",
       "1  2.0   小王  101  600.0   小王  一班\n",
       "2  3.0   小李  102  578.0   小李  二班\n",
       "3  NaN  NaN  104    NaN   小钱  三班"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.merge(df1, df2, on='学号', how='right')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1.3.4 外连接（outer）\n",
    "外连接是取两个表的并集"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名_x</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "      <th>姓名_y</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650.0</td>\n",
       "      <td>小张</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2.0</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600.0</td>\n",
       "      <td>小王</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3.0</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578.0</td>\n",
       "      <td>小李</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>4.0</td>\n",
       "      <td>小赵</td>\n",
       "      <td>103</td>\n",
       "      <td>550.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>104</td>\n",
       "      <td>NaN</td>\n",
       "      <td>小钱</td>\n",
       "      <td>三班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    名次 姓名_x   学号     成绩 姓名_y   班级\n",
       "0  1.0   小张  100  650.0   小张   一班\n",
       "1  2.0   小王  101  600.0   小王   一班\n",
       "2  3.0   小李  102  578.0   小李   二班\n",
       "3  4.0   小赵  103  550.0  NaN  NaN\n",
       "4  NaN  NaN  104    NaN   小钱   三班"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.merge(df1, df2, on='学号', how='outer')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.4 重复列名处理\n",
    "两个表进行连接，会遇到重复列名的情况，pd.merge会自动添加_x,_y,_z\n",
    "\n",
    "自定义重复列名：suffixes=['_x', '_y']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>名次</th>\n",
       "      <th>姓名_L</th>\n",
       "      <th>学号</th>\n",
       "      <th>成绩</th>\n",
       "      <th>姓名_R</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>小张</td>\n",
       "      <td>100</td>\n",
       "      <td>650</td>\n",
       "      <td>小张</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>小王</td>\n",
       "      <td>101</td>\n",
       "      <td>600</td>\n",
       "      <td>小王</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>小李</td>\n",
       "      <td>102</td>\n",
       "      <td>578</td>\n",
       "      <td>小李</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   名次 姓名_L   学号   成绩 姓名_R  班级\n",
       "0   1   小张  100  650   小张  一班\n",
       "1   2   小王  101  600   小王  一班\n",
       "2   3   小李  102  578   小李  二班"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.merge(df1, df2, on='学号', how='inner', suffixes=['_L', '_R'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 二、表的纵向拼接\n",
    "表的纵向连接和横向连接相对应，指的是在垂直方向上进行拼接\n",
    "\n",
    "一般应用场景为：分离若干个结构相同的数据合并成一个数据表，比如：两个班的花名册，两个表的结构是一样的，需要合并为一个表"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1 普通合并\n",
    "将两个表以列表的形式传递给pd.concat()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>编号</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>许丹</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>李旭文</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>程成</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>赵涛</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     姓名  班级\n",
       "编号         \n",
       "0    许丹  一班\n",
       "1   李旭文  一班\n",
       "2    程成  一班\n",
       "3    赵涛  一班"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>编号</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>赵义</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>李鹏</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>卫来</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>葛颜</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    姓名  班级\n",
       "编号        \n",
       "0   赵义  二班\n",
       "1   李鹏  二班\n",
       "2   卫来  二班\n",
       "3   葛颜  二班"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>编号</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>许丹</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>李旭文</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>程成</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>赵涛</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>赵义</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>李鹏</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>卫来</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>葛颜</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     姓名  班级\n",
       "编号         \n",
       "0    许丹  一班\n",
       "1   李旭文  一班\n",
       "2    程成  一班\n",
       "3    赵涛  一班\n",
       "0    赵义  二班\n",
       "1    李鹏  二班\n",
       "2    卫来  二班\n",
       "3    葛颜  二班"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([['许丹', '一班'], ['李旭文', '一班'], ['程成', '一班'], ['赵涛', '一班']], columns=['姓名', '班级'])\n",
    "df1.index.name = '编号'\n",
    "df1\n",
    "df2 = pd.DataFrame([['赵义', '二班'], ['李鹏', '二班'], ['卫来', '二班'], ['葛颜', '二班']], columns=['姓名', '班级'])\n",
    "df2.index.name = '编号'\n",
    "df2\n",
    "pd.concat([df1, df2])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2 索引设置\n",
    "参数ignore_index=True，生成一组新的索引，不保留原来的索引"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>许丹</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>李旭文</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>程成</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>赵涛</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>赵义</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>李鹏</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>卫来</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>葛颜</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    姓名  班级\n",
       "0   许丹  一班\n",
       "1  李旭文  一班\n",
       "2   程成  一班\n",
       "3   赵涛  一班\n",
       "4   赵义  二班\n",
       "5   李鹏  二班\n",
       "6   卫来  二班\n",
       "7   葛颜  二班"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.concat([df1, df2], ignore_index=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.3 重叠数据合并"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>编号</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>许丹</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>李旭文</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>卫来</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>程成</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>赵涛</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     姓名  班级\n",
       "编号         \n",
       "0    许丹  一班\n",
       "1   李旭文  一班\n",
       "2    卫来  二班\n",
       "3    程成  一班\n",
       "4    赵涛  一班"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>编号</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>赵义</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>许丹</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>李鹏</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>卫来</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>葛颜</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    姓名  班级\n",
       "编号        \n",
       "0   赵义  二班\n",
       "1   许丹  一班\n",
       "2   李鹏  二班\n",
       "3   卫来  二班\n",
       "4   葛颜  二班"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>姓名</th>\n",
       "      <th>班级</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>0</td>\n",
       "      <td>许丹</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>李旭文</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>卫来</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>程成</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>赵涛</td>\n",
       "      <td>一班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>赵义</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>李鹏</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>9</td>\n",
       "      <td>葛颜</td>\n",
       "      <td>二班</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    姓名  班级\n",
       "0   许丹  一班\n",
       "1  李旭文  一班\n",
       "2   卫来  二班\n",
       "3   程成  一班\n",
       "4   赵涛  一班\n",
       "5   赵义  二班\n",
       "7   李鹏  二班\n",
       "9   葛颜  二班"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = pd.DataFrame([['许丹', '一班'], ['李旭文', '一班'], ['卫来', '二班'], ['程成', '一班'], ['赵涛', '一班']], columns=['姓名', '班级'])\n",
    "df1.index.name = '编号'\n",
    "df1\n",
    "df2 = pd.DataFrame([['赵义', '二班'], ['许丹', '一班'], ['李鹏', '二班'], ['卫来', '二班'], ['葛颜', '二班']], columns=['姓名', '班级'])\n",
    "df2.index.name = '编号'\n",
    "df2\n",
    "pd.concat([df1, df2], ignore_index=True).drop_duplicates()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
